Statistical Properties of Factor Oracles

Authors

  • Jérémie Bourdon
  • Irena Rusu
Abstract

Factor and suffix oracles were introduced in [1] to provide an economical and efficient way of storing all the factors and all the suffixes, respectively, of a given text. Whereas good estimates exist for the worst-case size of the factor/suffix oracle, no average-case analysis has been done until now. In this paper, we give an estimate of the average size of the factor/suffix oracle of a text of length n when the alphabet size is 2, under a Bernoulli distribution model with parameter 1/2. To reach this goal, a new oracle is defined, which shares many of the properties of a factor/suffix oracle but is easier to study and provides an upper bound on the average size we are interested in. Our study introduces tools that could be used in other average-case analyses of factor/suffix oracles, for instance when the alphabet size is arbitrary.
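To make the averaged quantity concrete, the sketch below enumerates every binary text of length n, builds its factor oracle, and averages the number of transitions. Weighting all 2^n texts equally corresponds to a Bernoulli model with parameter 1/2. Two assumptions are made here: that "size" means the number of transitions, and that the oracle is built with the usual on-line construction of Allauzen, Crochemore and Raffinot rather than the new oracle defined in the paper. The names `build_factor_oracle` and `average_transitions` are hypothetical, and the enumeration is exponential in n, so it only suits small n.

```python
from itertools import product


def build_factor_oracle(text):
    """Return the transition table of the factor oracle of `text`."""
    trans = [{}]    # trans[state][symbol] -> target state
    supply = [-1]   # supply link of each state; -1 means "no link"
    for i, symbol in enumerate(text, start=1):
        trans.append({})
        trans[i - 1][symbol] = i          # internal transition i-1 -> i
        k = supply[i - 1]
        # Add external transitions along the chain of supply links.
        while k > -1 and symbol not in trans[k]:
            trans[k][symbol] = i
            k = supply[k]
        supply.append(0 if k == -1 else trans[k][symbol])
    return trans


def average_transitions(n):
    """Exact average number of transitions over all binary texts of length n."""
    total = 0
    for letters in product("ab", repeat=n):
        trans = build_factor_oracle("".join(letters))
        total += sum(len(out) for out in trans)
    return total / 2 ** n
```

For n = 2, `average_transitions(2)` returns 2.5: the texts "aa" and "bb" need 2 transitions each, while "ab" and "ba" need 3 each.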

Similar resources

Constructing Factor Oracles

A factor oracle is a data structure for weak factor recognition. It is an automaton built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, and has m to 2m−1 transitions. In this paper, we give two alternative algorithms for its construction and prove the constructed automata to be equivalent to the automata constructed by the a...
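The properties listed above can be checked on a small example. The following is a minimal sketch of the well-known on-line construction of Allauzen, Crochemore and Raffinot, which is not necessarily either of the two alternative algorithms this paper proposes. The function names `build_factor_oracle`, `accepts`, and `count_transitions` are hypothetical.

```python
def build_factor_oracle(p):
    """Build the factor oracle of p; returns one transition dict per state."""
    trans = [{}]    # state 0 is the initial state
    supply = [-1]   # supply link of state 0 is undefined, encoded as -1
    for i, symbol in enumerate(p, start=1):
        trans.append({})                  # create state i
        trans[i - 1][symbol] = i          # internal transition i-1 -> i
        k = supply[i - 1]
        # Walk the supply links, adding missing transitions to state i.
        while k > -1 and symbol not in trans[k]:
            trans[k][symbol] = i
            k = supply[k]
        supply.append(0 if k == -1 else trans[k][symbol])
    return trans


def accepts(trans, word):
    """All states are final, so a word is accepted if every step is defined."""
    state = 0
    for symbol in word:
        if symbol not in trans[state]:
            return False
        state = trans[state][symbol]
    return True


def count_transitions(trans):
    return sum(len(out) for out in trans)
```

For p = "abbbaab" (m = 7), this yields 8 states and 11 transitions, which lies within the stated range of m to 2m−1, and every factor of p is accepted.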

Error analysis of factor oracles

Factor oracles [1] constructed from a given text are deterministic acyclic automata accepting all substrings of the text. Factor oracles are more space-economical and easier to implement than similar data structures such as suffix trees [6]. There is, however, a drawback: a factor oracle may accept strings that are not in the text, which we call an error acceptance. In this paper, we characterize factor or...
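An error acceptance is easy to exhibit by brute force. The sketch below builds the oracle with the usual on-line construction and then lists every word that the oracle accepts but that is not a substring of the text. It is an illustrative exhaustive search, not the characterization developed in this paper, and the names `build_factor_oracle`, `accepts`, and `error_acceptances` are hypothetical.

```python
from itertools import product


def build_factor_oracle(text):
    """Return the transition table of the factor oracle of `text`."""
    trans = [{}]
    supply = [-1]
    for i, symbol in enumerate(text, start=1):
        trans.append({})
        trans[i - 1][symbol] = i
        k = supply[i - 1]
        while k > -1 and symbol not in trans[k]:
            trans[k][symbol] = i
            k = supply[k]
        supply.append(0 if k == -1 else trans[k][symbol])
    return trans


def accepts(trans, word):
    state = 0
    for symbol in word:
        if symbol not in trans[state]:
            return False
        state = trans[state][symbol]
    return True


def error_acceptances(text):
    """Words accepted by the oracle of `text` that are not substrings of it."""
    trans = build_factor_oracle(text)
    alphabet = sorted(set(text))
    found = []
    # The oracle is acyclic with len(text) + 1 states, so no accepted
    # word can be longer than the text itself.
    for length in range(1, len(text) + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if accepts(trans, word) and word not in text:
                found.append(word)
    return found
```

For the text "abbbaab", the result includes "abba", which the oracle accepts although it never occurs in the text. For "aaaa" the list is empty.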

Finding Maximal Repeats with Factor Oracles

Factor oracles, built from an input text, are automata similar to suffix automata that accept at least all substrings of the input text. In [LL00] and [LLA02], factor oracles are used to detect repeats in a text. Although the repeats found with these methods are not maximal, the average error is very low and the algorithm runs quite fast. In this paper, we present two ideas to improve accuracy of t...

Using Factor Oracles for Machine Improvisation

We describe the variable Markov models we have used for statistical learning of musical sequences, then we present the factor oracle, a data structure proposed by Crochemore et al. for string matching. We show the relation between this structure and the previous models and indicate how it can be adapted for learning musical sequences and generating improvisations in a real-time context.

Weak Factor Automata: Comparing (Failure) Oracles and Storacles

The factor oracle [3] is a data structure for weak factor recognition. It is a deterministic finite automaton (DFA) built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, is homogeneous, and has m to 2m − 1 transitions. The factor storacle [6] is an alternative automaton that satisfies the same properties, except that its numbe...

Publication date: 2009